Pearl shape classification using deep convolutional neural networks from Tahitian pearl rotation in Pinctada margaritifera

Tahitian pearls, artificially cultivated from the black-lipped pearl oyster Pinctada margaritifera, are renowned for their unique color and large size, making the pearl industry vital for the French Polynesian economy. Understanding the mechanisms of pearl formation is essential for high-quality, sustainable production. In this paper, we explore the process of pearl formation by studying pearl rotation. Using a deep convolutional neural network, we show a direct link between the rotation of the pearl during its formation in the oyster and its final shape. We propose a new method for non-invasive pearl monitoring and a model that predicts the final shape of the pearl from rotation data with 81.9% accuracy. These resources offer a new perspective for studying the overall mechanism of pearl formation, with potential long-term applications for improving pearl production and quality control in the industry.

Table including all the information associated with each oyster: the date of grafting, the start and end dates of acquisition, and the final shape attributed. Note that the end date of acquisition is also the date on which the oyster was sacrificed.

Figure 1: Photographs of all the pearls processed by our device, except for the 7 pearls that were lost between the characterization and photography stages.

Table 6: Performance results from various models on our dataset. The best accuracy achieved after optimization is reported; each accuracy score is computed on the test set and given as the mean over randomized batches. Special care is taken to split oysters into different sets to avoid overfitting. All models were computed with the BioDiscML [22] tool, except the LSTM model, which was manually coded and optimized in Python. A bidirectional LSTM model was also tried and gave the same results as the LSTM model. This table gives an overview of the results obtained, but it does not include all the models tested. Details about the parametrization of these models can be found at https://github.com/mickaelleclercq/BioDiscML/blob/master/classifiers.conf

3. Max-pooling layers: VGG-16 includes 5 max-pooling layers, each using a 2x2 kernel with a stride of 2. These layers reduce the spatial dimensions of the feature maps, thereby decreasing computational complexity and capturing translation-invariant features. They are interspersed between the convolutional layers.
4. Fully connected layers: After the convolutional and max-pooling layers, VGG-16 has 3 fully connected (dense) layers. The first two dense layers have 4096 units each, followed by three layers with decreasing unit sizes of 1000, 512, and 256, respectively. The size of the final dense layer depends on the number of output classes K in the classification task, which is 3 in our case. These layers integrate the high-level features extracted by the convolutional layers to make the final classification decision. Each fully connected layer is followed by a 50% dropout layer to prevent overfitting.

5. Activation functions: The VGG-16 architecture employs the Rectified Linear Unit (ReLU) activation function throughout the network, except for the final dense layer, where a SoftMax activation function is utilized to output class probabilities.
6. Output layer: The VGG-16 network's output layer provides the class probabilities, and the class with the highest probability is chosen as the final classification result.
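The last two items (SoftMax on the final dense layer, then selection of the most probable class) can be illustrated with a minimal sketch in plain Python. The class labels and raw scores below are hypothetical placeholders chosen for illustration; they are not values from the trained model.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_class(logits, class_names):
    """Return the most probable class name and the full probability list."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs

# Hypothetical labels and scores for K = 3 classes (illustration only).
CLASS_NAMES = ["shape_A", "shape_B", "shape_C"]
label, probs = predict_class([1.2, 0.3, -0.8], CLASS_NAMES)
print(label, [round(p, 3) for p in probs])
```

The class with the largest raw score always receives the largest probability, so the arg-max could be taken on the scores directly; SoftMax is useful because it also gives a normalized confidence for each class.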
7. Output: Training, validation, and test sets are created from filtered, consistent data with added metadata (time elapsed since grafting for acquisition and sacrifice). After filtering outliers due to various acquisition issues, a total of 218 samples from 47 different pearls were retained.
8. Input: Load a pre-trained VGG16 model with custom layers, using weights pre-trained on the ImageNet [20] dataset.
9. Training of the model on our datasets to fine-tune the weights for our images.
10. Hyperparameter optimization using grid search on batch size, learning rate, number of epochs, and regularization. Evaluation of the different models by accuracy and F1-score.
11. Selection of the best model for shape classification on our pearls.
12. Input: New rotation data transformed into images as in the training set to be used on our trained model.
13. Output: Final shape prediction of the pearl from new rotation data.
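Because several samples come from the same pearl (218 samples from 47 pearls), the training, validation, and test sets should be built by splitting pearls rather than individual samples. The following is a minimal sketch of such a group-wise split, not the authors' code: the function name `split_by_pearl`, the 70/15/15 fractions, the seed, and the toy sample list are all assumptions made for illustration.

```python
import random

def split_by_pearl(samples, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split samples into train/val/test so that no pearl spans two sets.

    samples: list of (pearl_id, sample) pairs.
    fractions: assumed train/val shares of the pearls (the rest go to test).
    """
    pearl_ids = sorted({pid for pid, _ in samples})
    random.Random(seed).shuffle(pearl_ids)  # reproducible shuffle of pearls
    n_train = round(fractions[0] * len(pearl_ids))
    n_val = round(fractions[1] * len(pearl_ids))
    groups = {
        "train": set(pearl_ids[:n_train]),
        "val": set(pearl_ids[n_train:n_train + n_val]),
        "test": set(pearl_ids[n_train + n_val:]),
    }
    # Every sample follows its pearl into exactly one set.
    return {name: [s for s in samples if s[0] in ids]
            for name, ids in groups.items()}

# Toy data with the same counts as the text: 218 samples from 47 pearls.
toy_samples = [(i % 47, f"sample_{i}") for i in range(218)]
splits = split_by_pearl(toy_samples)
print({name: len(items) for name, items in splits.items()})
```

Shuffling pearl identifiers instead of samples guarantees that the model is never evaluated on a pearl it has already seen during training.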
Supplementary Note 4: Detailed description of each block from the feature extraction process, from Fig. 6.
1. Block1-conv1: This is the first layer of the VGG-16 model that detects low-level features such as edges and textures.
2. Block2-conv1: This is the first layer of the second block, where the model starts recognizing more complex patterns like corners and simple shapes.
3. Block3-conv1: The first layer of the third block captures higher-level features, such as object parts and more complex shapes.
4. Block4-conv1: In the first layer of the fourth block, the model identifies even more abstract features, like parts of objects or specific textures related to the objects in the image.
5. Block5-conv1: The first layer of the fifth block represents the highest level of abstraction in the VGG-16 model. At this point, the model has captured more complex features and patterns that help it differentiate between various objects and scenes. Visualizing this layer offers insights into the model's ability to recognize high-level visual concepts.
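To make the progression across the five blocks concrete, the sketch below computes the feature-map shape produced by the first convolution of each block in the standard VGG-16 layout. It assumes a 224x224 input (the usual ImageNet size, not stated here) and uses Keras-style layer names such as `block1_conv1`; the function name `conv1_output_shapes` is hypothetical.

```python
# Number of output channels of the convolutions in each standard VGG-16 block.
BLOCK_FILTERS = [64, 128, 256, 512, 512]

def conv1_output_shapes(input_size=224):
    """Return {layer_name: (height, width, channels)} for block1..block5 conv1.

    The 3x3 convolutions use 'same' padding and keep the spatial size;
    the 2x2, stride-2 max-pooling that ends each block halves it.
    """
    shapes = {}
    size = input_size  # assumed input resolution
    for block, filters in enumerate(BLOCK_FILTERS, start=1):
        shapes[f"block{block}_conv1"] = (size, size, filters)
        size //= 2  # max-pooling at the end of the block
    return shapes

for name, shape in conv1_output_shapes().items():
    print(name, shape)
```

The maps become spatially coarser and deeper from block to block, which is consistent with the shift from edges and textures in the first block to more abstract patterns in the fifth.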